Winner's Curse Correction and Variable Thresholding Improve Performance of Polygenic Risk Modeling Based on Genome-Wide Association Study Summary-Level Data

Authors

Jianxin Shi

Ju-Hyun Park

Jubao Duan

Sonja T Berndt

Winton Moy

Kai Yu

Lei Song

William Wheeler

Xing Hua

Debra Silverman

Montserrat Garcia-Closas

Chao Agnes Hsiung

Jonine D Figueroa

Victoria K Cortessis

Núria Malats

Margaret R Karagas

Paolo Vineis

I-Shou Chang

Dongxin Lin

Baosen Zhou

Adeline Seow

Keitaro Matsuo

Yun-Chul Hong

Neil E Caporaso

Brian Wolpin

Eric Jacobs

Gloria M Petersen

Alison P Klein

Donghui Li

Harvey Risch

Alan R Sanders

Li Hsu

Robert E Schoen

Hermann Brenner

Rachael Stolzenberg-Solomon

Pablo Gejman

Qing Lan

Nathaniel Rothman

Laufey T Amundadottir

Maria Teresa Landi

Douglas F Levinson

Stephen J Chanock

Nilanjan Chatterjee

Abstract

Recent heritability analyses have indicated that genome-wide association studies (GWAS) have the potential to improve genetic risk prediction for complex diseases based on the polygenic risk score (PRS), a simple modelling technique that can be implemented using summary-level data from the discovery samples. We herein propose modifications to improve the performance of PRS. We introduce threshold-dependent winner's-curse adjustments for the marginal association coefficients that are used to weight the single-nucleotide polymorphisms (SNPs) in PRS. Further, as a way to incorporate external functional/annotation knowledge that could identify subsets of SNPs highly enriched for associations, we propose variable thresholds for SNP selection. We applied our methods to GWAS summary-level data of 14 complex diseases. Across all diseases, a simple winner's curse correction uniformly improved the performance of the models, whereas incorporation of functional SNPs was beneficial only for selected diseases. Compared to the standard PRS algorithm, the proposed methods in combination led to a notable gain in efficiency (25-50% increase in the prediction R²) for 5 of 14 diseases. As an example, for GWAS of type 2 diabetes, winner's curse correction improved prediction R² from 2.29% based on the standard PRS to 3.10% (P = 0.0017), and incorporating functional annotation data further improved R² to 3.53% (P = 2×10⁻⁵). Our simulation studies illustrate why differential treatment of certain categories of functional SNPs, even when shown to be highly enriched for GWAS heritability, does not lead to proportionate improvement in genetic risk prediction because of non-uniform linkage disequilibrium structure.
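The two ideas in the abstract, shrinking the weights of selected SNPs and using different selection thresholds for different SNP categories, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: soft-thresholding is used here as one simple threshold-dependent shrinkage, and all function names, the toy effect sizes, and the threshold values 0.05 and 0.5 are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def z_cutoffs(p_thresholds, n):
    """Two-sided z-score cutoffs for one shared or n per-SNP p-value thresholds."""
    p = np.broadcast_to(np.asarray(p_thresholds, dtype=float), (n,))
    return np.array([NormalDist().inv_cdf(1.0 - x / 2.0) for x in p])


def standard_weights(beta, se, p_thresholds):
    """Standard PRS weights: keep the raw estimate if the SNP passes its threshold."""
    keep = np.abs(beta / se) >= z_cutoffs(p_thresholds, len(beta))
    return np.where(keep, beta, 0.0)


def shrunk_weights(beta, se, p_thresholds):
    """Illustrative winner's-curse adjustment (assumption: soft-thresholding).

    Each estimate is pulled toward zero by cutoff * se, so SNPs that barely
    pass the threshold get weights near zero; SNPs that fail it get zero.
    """
    lam = z_cutoffs(p_thresholds, len(beta)) * se
    return np.sign(beta) * np.maximum(np.abs(beta) - lam, 0.0)


def polygenic_score(genotypes, weights):
    """PRS per individual: weighted sum of allele counts (rows = individuals)."""
    return np.asarray(genotypes, dtype=float) @ weights


# Toy summary statistics for three SNPs (hypothetical values).
beta = np.array([0.10, -0.02, 0.05])
se = np.array([0.02, 0.02, 0.02])
is_functional = np.array([False, True, False])

# Variable thresholding: a more liberal threshold for the annotated SNP.
per_snp_p = np.where(is_functional, 0.5, 0.05)

w_standard = standard_weights(beta, se, 0.05)
w_shrunk = shrunk_weights(beta, se, 0.05)
w_variable = shrunk_weights(beta, se, per_snp_p)

genotypes = np.array([[0, 1, 2], [2, 0, 1]])
scores = polygenic_score(genotypes, w_standard)
```

With a single threshold the second SNP is excluded from the score; with the more liberal threshold for its category it enters with a heavily shrunk weight. In practice the thresholds would be tuned on validation data rather than fixed in advance.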

Similar resources

Estimating the Total Number of Susceptibility Variants Underlying Complex Diseases from Genome-Wide Association Studies

Recently genome-wide association studies (GWAS) have identified numerous susceptibility variants for complex diseases. In this study we proposed several approaches to estimate the total number of variants underlying these diseases. We assume that the variance explained by genetic markers (Vg) follow an exponential distribution, which is justified by previous studies on theories of adaptation. O...

Association between polygenic risk for schizophrenia, neurocognition and social cognition across development

Breakthroughs in genomics have begun to unravel the genetic architecture of schizophrenia risk, providing methods for quantifying schizophrenia polygenic risk based on common genetic variants. Our objective in the current study was to understand the relationship between schizophrenia genetic risk variants and neurocognitive development in healthy individuals. We first used combined genomic and ...

Illustrating, Quantifying, and Correcting for Bias in Post-hoc Analysis of Gene-Based Rare Variant Tests of Association

To date, gene-based rare variant testing approaches have focused on aggregating information across sets of variants to maximize statistical power in identifying genes showing significant association with diseases. Beyond identifying genes that are associated with diseases, the identification of causal variant(s) in those genes and estimation of their effect is crucial for planning replication s...

Quantifying and correcting for the winner's curse in quantitative-trait association studies.

Quantitative traits (QT) are an important focus of human genetic studies both because of interest in the traits themselves and because of their role as risk factors for many human diseases. For large-scale QT association studies including genome-wide association studies, investigators usually focus on genetic loci showing significant evidence for SNP-QT association, and genetic effect size tend...

Explicit Modeling of Ancestry Improves Polygenic Risk Scores and BLUP Prediction.

Polygenic prediction using genome-wide SNPs can provide high prediction accuracy for complex traits. Here, we investigate the question of how to account for genetic ancestry when conducting polygenic prediction. We show that the accuracy of polygenic prediction in structured populations may be partly due to genetic ancestry. However, we hypothesized that explicitly modeling ancestry could impro...

Journal title:

Volume 12, Issue

Pages -

Publication date: 2016
